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ABSTRACT 


B is a computer language designed by D. M. Ritchie and K. L. 
Thompson, for primarily non-numeric applications such as system 
programming. These typically involve complex logical decision—- 
making, and processing or integers, characters, and bit strings. 
On the H6070 TSS system, B programs are usually much easier to 
write and understand than assembly language programs, and object 
code efficiency is almost as good. Implementation of simple TSS 


subsystems is an especially appropriate use for B. 


This technical report contains a description of the MH-TSS 
(Honeywell 6070) version of B (by S. C. Johnson), and a tutorial 
introduction to most of the features of the language (by B. W. 


Kernighan). 


A TUTORIAL INTRODUCTION TO THE LANGUAGE B 
by 


B. W. Kernighan 
Bell Laboratories 
Murray Hill, New Jersey 


4. Introduction 


B is a new computer language designed and implemented at Murray 
Hill. It runs and is actively supported and documented on the 
H6070 TSS system at Murray Hill. 


B is particularly suited for non-numeric computations, typified 
by system programming. These usually involve many complex logi- 
cal decisions, computations on integers and fields of words, 
especially characters and bit strings, and no floating point. B 
programs for such operations are substantially easier to write 
and understand than GMAP programs. The generated code is quite 
good. Implementation of simple TSS subsystems is an especially 
good use for B. 


B is reminiscent of BCPL[2], for those who can remember. The 
Original design and implementation are the work of K. L. Thompson 
and D. M. Ritchie; their original 6070 version has been. substan- 
tially improved by S. C. Johnson, who also wrote the runtime 
library. 


This memo is a tutorial to make learning B as painless as possi- 
ble. Most of the features of the language are mentioned in pass~ 
ing, but only the most important are stressed. Users who would 
like the full story should consult A User’s Reference to B on 
MH-TSS , by S. C. Johnson [1], which should be read for details 
anyway. 


We will assume that the user is familiar with the mysteries of 
TSS, such as creating files, text editing, and the like, and has 
programmed in some language before. 


Throughout, the symbolism (->n) implies that a topic will be 
expanded upon in section n of this manual. Appendix E is an 
index to the contents. 


2. General Layout of B Programs 


main() { 
-- statements -- 


newfunc(arg1,arg2) { 
-- statements -- 


} 


fun3(arg) { 
-—- more statements -- 


All B programs consist of one or more "functions , which are 
similar to the functions and subroutines of a Fortran program, or 
the procedures of PL/I. main is such a function, and in fact all 
B programs must have a main. Execution of the program begins at 
the first statement of main, and usually ends at the last. main 
will usually invoke other functions to perform its job, some com- 
ing from the same program, and others from libraries. As in For- 
tran, one method of communicating data between functions is by 
‘arguments. The parentheses following the function name surround 
the argument list; here main is a function of no arguments, indi- 
cated by (). The {} enclose the statements of the function. 


Functions are totally independent as far as the compiler is con- 
cerned, and main need not be the first, although execution starts 
with it. This program has two other functions: newfunc has two 
arguments, and fun3 has one. Each function consists of one or 
more statements which express operations on variables and 
transfer of control. Functions may be used recursively at little 
cost in time or space (->30). 


Most statements contain expressions, which in turn are made up of 
operators, names, and constants, with parentheses to alter the 
order of evaluation. B has a particularly rich set of operators 
and expressions, and relatively few statement types. 


The format of B programs is quite free; we will strive for a uni- 
form style as a good programming practice. Statements can be 
broken into more than one line at any reasonable point, i.e., 
between names, Operators, and constants. Conversely, more than 
‘one statement can occur on one line. Statements are separated by 
semicolons. A group of statements may be grouped together and 
treated as a single statement by enclosing them in {}; in this 
case, no semicolon is used after the ‘}’. This convention is 
used to lump together the statements of a function. 


3. Compiling and Running B Programs 


(a) SYSTEM? filsys cf bsource,b/1,100/ 

(b) SYSTEM? filsys cf bhs,b/6,6/,m/r/ 

(c) SYSTEM? [create source program with any text editor] 
(a) SYSTEM? ./bj (w) bsource bhs 

(e) SYSTEM? /bhs 


(a) and (b) need only be done once; they create file space for 
the source program and the loaded program. (ad) compiles the 
source program from bsource, and after many machinations, creates 
a program in bhs. (e) runs that program. ./bj is more fully 
described in [1]. 


You may from time to time get error messages from the B compiler. 
These look like either of 


ee n 
ee name n 


where ee is a two-character error message, and n is the approxi- 
mate source line number of the error. (->Appendix A) 


4. A Working B Program; Variables 


main(){ 
auto a, b, Cc, sum; 


ae ty) bee. 23 C= 3% 
sum = a + b + c3 
putnumb(sum) ; 


} 


This simple program adds three numbers, and prints the sum. It 
illustrates the use of variables and constants, declarations, a 
bit of arithmetic, and a function call. Let us discuss them in 
reverse order. 


putnumb is a library function of one argument, which will print a 
number on the terminal (unless some other destination is speci- 
fied (->28)). The code for a version of putnumb can be found in 
(->30). Notice that a function is invoked by naming it, followed 
by the argument(s) in parentheses. There is no CALL statement as 
in Fortran. 


The arithmetic and the assignment statements are much the Same as 
in Fortran, except for the semicolons. Notice that we can put 
several on a line if we want. Conversely, we can split a state- 
ment among lines if it seems desirable - the split may be between 
any of the operators or variables, but not in the middle of a 
variable name. 
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One major difference between B and Fortran is that B is a. 
typeless language: there is no analog in B of the Fortran IJKLMN 
convention. Thus a, b, c, and sum are all 36-bit quantities, and. 
arithmetic operations are integer. This is discussed at length 
in the next section (->5). 


Variable names have one to eight ascii characters, chosen from 
A-Z, a-Z, «, _,» O-9, and start with a non-digit. Stylistically, 
it’s much better to use only a single case (upper or lower) and 
give functions and external variables (->7) names that are unique 
in the first six characters. (Function and external variable 
names are used by batch GMAP, which is limited to six character 
single-case identifiers. ) 


The statement “auto ...” is a declaration, that is, it defines 
the variables to be used within the function. auto in particular 
is used to declare local variables, variables which exist only 
within their own function (main in this case). Variables may be 
distributed among auto declarations arbitrarily, but all declara- 
tions must precede executable statements. All variables must be 
declared, as one of auto, extrn (->7), or implicitly as function 
arguments (~>8). 


auto variables are different from Fortran local variables in one 
important respect - they appear when the function is entered, and 
disappear when it is left. They cannot be initialized at compile 
time, and they have no memory from one call to the next. 


5. Arithmetic; Octal Numbers 


— 


All arithmetic in B is integer, unless special functions are 
written. There is no equivalent of the Fortran IJKLMN conven— 
tion, no floating point, no data types, no type conversions, and 
no type checking. Users of double-precision complex will have to 
fend for themselves. (But see ~—>32 for how to call Fortran 
subroutines, ) 


The arithmetic operators are the usual ‘+’, ‘-’, “°*’, and °/’ 
(integer division). In addition we have the remainder operator 


°Y%s 
xX = aSb 


sets x to the remainder after a is divided by b (oth of which 
should be positive). 


Since B is often used for system programming and bit- 
manipulation, octal numbers are an important part of the 
language. The syntax of B says that any number that begins with 
O is an octal number (and hence can’t have any 8’s or 9’s in it). 
Thus 0777 is an octal constant, with decimal value 511. 


Ss Bore OS nk Be Sh cog bo Pe etn eee 


6. Characters; putchar; Newline 





main(){ 
auto a; 
as ‘hil’; 
putchar(a); 
pucchan tad 


This program prints “hil” on the terminal; it illustrates the use 
of character constants and variables. A character is one to 
four ascii characters, enclosed in single quotes. The characters 
are stored in a single machine word, right-justified and zero- 
filled. We used a variable to hold hi! , but could have used a 
constant directly as well. . 


The sequence "tn" is B jargon for “newline character’, which, 
when printed, skips the terminal to the beginning of the next 
line. No output occurs until putchar,encounters a *n or the 
program terminates. Omission of the *n , a common error, causes 
all output to appear on one line, which is usually not the | 
desired result. There are several other escapes like “*n for 
representing hard-to-get characters (-—>Appendix B). Notice that 
#n" is a single character: putchar(’hil*n’) prints four charac- 
ters (the maximum with a single call). 


Since B is a typeless language, arithmetic on characters is quite 
legal, and even makes sense sometimes: 


Cc a8 ct+’A’—-%a’ 


converts a single character stored in c to upper case (making use 
of the fact that corresponding ascii letters are a fixed distance 
apart). 


7. External Variables 


main(){ 
extrn a,b, 
ee 


C5 
3 


putchar(b); putchar(c); putchar(’!#n’); 


This example illustrates external variables, variables which are 
rather like Fortran COMMON, in that they exist external to all 
functions, and are (potentially) available to all functions. Any 
function that wishes to access an external variable must contain 
an extrn declaration for it. Furthermore, we must define all 
external variables outside any function, For our example 





en 


variables a, b, and c, this is done in the last three lines. 


External storage is static, and always remains in existence, so 
it can be initialized. (Conversely, auto storage appears 
whenever the function is invoked, and then disappears when the 
function is left. auto variables of the same name in different 
functions are unrelated.) Initialization is done as shown: the 
initial value appears after the name, as a constant. We have 
shown character constants; other forms are also possible. Exter- 
nal variables are initialized to zero if not set explicitly. 


8. Functions 


main(){ 
extrn a,b,c,d; 
put2char(a,b); 
eon 


put2char(x,y) { 
putchar(x); 
oes 


a ‘hell’; b ’o, w’3 ¢ ‘orld ’3; a “J#n’; 


This example does the same thing as the last, but uses a separate 
(and quite artificial) function of two arguments, put2char. The 
dummy names x and y refer to the arguments throughout put¢char; 
they are not declared in an auto statement. We could equally 
well have made a,b,c,d auto variables, or have called putzchar as 


put2char(‘’hell’,’o, w’); 
etc. 


Execution of put2char terminates when it runs into the closing 
‘}’, just as in main; control returns to the next statement of 
the calling program. 


9. Input; getchar 


main() { 
auto c; 
¢ = getchar(); 
putchar(c); 


getchar is another library 10 function. It fetches one character 
from the input line each time it is called, and returns that 
character, right adjusted and zero filled, as the value of the 


ey ae 


function. The actual implementation of getchar is to read an 
entire line, and then hand out the characters one at a time. 
After the ’*n’ at the end of the line has been returned, the next 
call to getchar will cause an entire new line to be read, and its 
(ey returned. Input is generally from the terminal 
->28). 


40. Relational Operators 


equal to (".EQ." to Fortraners) 
not equal to 

greater than 

less than 

greater than or equal to 

less than or equal to 


NV AVM Il 


These are the relational operators; we will use ‘I=’ ("not equal 

to) in the next example. Don’t forget that the equality test is 
‘s=’s using just one ’=" leads to disaster of one sort or anoth- 

er. Note: all comparisons are arithmetic, not logical. 


1. if; goto; Labels; Comments 


main(){ 
auto c¢c; 
read: 
¢ = getchar(); 
putchar(c); 
if(c I= ’*n’) goto read; /* loop if not a newline */ 


This function reads and writes one line. It illustrates several 
new things. First and easiest is the comment, which is anything 
enclosed in /*...*/, just as in PL/I. 


read is a label -— a name followed by a colon °:’, also just as in 
PL/I. A statement can have several labels if desired. (Good 
programming practice dictates using few labels, so later examples 
will strive to get rid of them.) The goto references a label in 
the same function. 





The if statement is one of four conditionals in B. (while (->14), 


switch (->26), and "?:" (->16) are the others.) Its general form 
is 


if (expression) statement 
The condition to be tested is any expression enclosed in 


parentheses. It is followed by a statement. The expression is 
evaluated, and if its value is non-zero, the statement is 


2B: = 


executed. One of the nice things about B is that the statement 
can be made arbitrarily complex (although we’ve used very simple 
ones here) by enclosing simple statements in {}. (->12) 


42. Compound Statements 


if (a <b) { 
as 
b; 
t; 


| i 


“Om ct 


As a simple example of compound statements, suppose we want to 
ensure that a exceeds b, as part of a sort routine. The inter- 
change of a and b takes three statements in B; they are grouped 
together by {} in the example above. 


As a general rule in B, anywhere you can use a simple statement, 
you can use any compound statement, which is just a number of 
simple or compound ones enclosed in {}. Compound statements are 
not followed by semicolons. 


The ability to replace single statements by complex ones at will 
is one thing that makes B much more pleasant to use than Fortran. 
Logic which requires several GOTO’s and labels in Fortran can be 
done in a simple clear natural way using the compound statements 
of B. 


13. Function Nesting 


— 


read: 
c= putchar (getchar() ); 
if (c != ‘*n") goto read; 


Functions which return values (as all do, potentially -—>24) can 
be used anywhere you might normally use an expression. Thus we 
can use getchar as the argument of putchar, nesting the func— 
tions. getchar returns exactly the character that putchar needs 
as an argument. putchar also passes its argument back as a 
value, so we assign this to c for later reference. 


Even better, if c won’t be needed later, we can say 


read: 
if (putchar(getchar()) != ’*n’) goto read; 


14, While statement; Assignment within an Expression 





while ((c=sgetchar()) != ’#n’) putchar(c); 


Avoiding goto’s is frequently good practice, and here is a simi- 
lar ee ecient goto’s. This has almost the same meaning as 
the previous one. (The difference? This one doesn *t print the 
terminating ‘’*n’.) The while statement is basically a loop state- 
ment, whose general form is 


while (expression) statement 


As in the if statement, the expression and the statement can both 
be arbitrarily complicated, although we haven’t seen that yet. 
The action of while is 


(a) evaluate the expression 
(b) if its value is true (i.e., not zero), do the 
statement and return to (a) 


Our example gets the character and assigns it to c, and tests to 
see if it’s a ’*n’. If it isn’t, it does the statement part of 
the while, a putchar, and then repeats. 


Notice that we used an assignment statement "c=getchar()” within 
an expression. This is a handy notational shortcut, usually pro- 
duces better code, and is sometimes necessary if a statement is 
to be compressed. This works because an assignment statement has 
a value, just as any other expression does. This also implies 
that we can successfully use multiple assignments like 


x=yoe= f(a)+g(b) 
As an aside, the extra pair of parentheses in the assignment 
statement within the conditional were really necessary: if we had 
said 

c¢ = getchar() I= ‘*n’ 
c would be set to 0 or 1 depending on whether the character- 
fetched was a newline or not. This is because the binding 


strength of the assignment operator ‘=’ is lower than the rela- 
tional operator ‘I=’. (->Appendix D) 


45. More on Whiles; Null Statement; exit 


main() { 
auto C;3 
while (1){ 
while ( (c=getchar()) I= ° °) 
if (putchar(c) == ‘*n‘’) exit(); 


putchar(’*n’ )3 


aao: = 


while ( (c=getchar()) == ° °°); /* skip blanks */ 
if (putchar(c)==’*n’) exit(); /* done when newline */ 


} 


This terse example reads one line from the terminal, and prints 
each non-blank string of characters on a separate line: one or 
more blanks are converted into a single newline. 


Line four fetches a character, and if it is non-blank, prints it, 
and uses the library function exit to terminate execution if its 
a newline. Note the use of the assignment within an expression 
to save the value of c. 


Line seven skips over multiple blanks. It fetches a character, 
and if it’s blank, does the statement °’;’. This is called a null 
statement - it really does nothing. Line seven is just a very 
tight loop, with all the work done in the expression part of the 
while clause. 


The slightly odd-looking construction 


while(1){ 


} 
is a way to get an infinite loop. "4" is always non-zero, so the 
statement part (compound here!) will always be done. The only 
way to get out of the loop is to execute a goto (->11) or break 


(—>26), or return from the function with a return statement 
(->24), or terminate execution by exit. 


16. Else Clause; Conditional Expressions 


— 


if (expression) statement! else statement2 


is the most general form of the if - the else part is sometimes 
useful. The canonical example sets x to the minimum of a and b: 


if (a<b) x=a; else x=b; 
tf’s and else’s can be used to construct logic that branches one 


of several ways, and then rejoins, a common programming struc— 
ture, in this way: 


if(---) 
aaa 

else if(---—) 
{ Seen 


etc. 


The conditions are tested in order, and exactly one block is exe- 
cuted - the first one whose if is satisfied. When this block is 
finished, the next statement “executed is the one after the last 
else if block. 


B provides an alternate form of conditional which is often more 
concise and efficient. It is called the "conditional expression’ 
because it is a conditional which actually has a value and can be 
used anywhere an expression can. The last example can best be 
expressed as 


x = acb ? as: b; 


The meaning is: evaluate the expression to the left of °?’. If it 
is true, evaluate the expression between ‘?’ and ‘:° and assign 
that value to x. If it’s false, evaluate the expression to the 


right of ‘:’ and assign its value to x. 


17. Unary Operators 


auto k; 
k = 0; 
while (getchar() != ‘#n’) ++k; 


B has several interesting unary operators in addition to the 


’ 


traditional ‘-’. The program above uses the increment Opeg ator 
‘++’ to count the, number of characters in an input line. 
is equivalent to "k=k+1" but generates better code. “++! ey 
also be used after a variable, as in 

k++ 
The only difference is this. Suppose k is 5. Then 

x = t+k 
sets k to 6 and then sets x to k, i.@., to 6. But 

x = Ktt 


sets x to 5, and then k to 6. 


Similarly, the unary decrement operator ‘--’ can be used either 
before or after a variable to decrement by one, 
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“a 
uo 


b J 

((i = i-9) >= 0) c = c | getchar( )<<i; 

B has several operators for logical bit-manipulation., The exam-— 
ple above illustrates two, the OR ‘|’, and the left shift ’<<’. 
This code packs the next four 9-bit characters from the input 
into the single 36-bit word c, by successively left-shifting the 
character by i bit positions to the correct position and OR-ing 
it into the word. 


The other logical operators are AND ’&’ logical right shift 


‘>>’, Exclusive OR ,» and 1’s complement : 
x = ~y&0777 
sets x to the last 9 bits of the 1’s complement of y, and 
x = x&077 
replaces x by its last six bits. 
The order of evaluation of unparenthesized expressions is deter- 


mined by the relative binding strengths of the operators involved 
(—>Appendix D). 


49. Assignment Operators 


An | unusual feature of B is that the normal binary operators like 
“+’, °-", etc. can be combined with the assignment operator = 
to form new assignment operators. For example, 

x =- 10 
means "x=x-10". ‘=~’ is an assignment operator. This convention 
is a useful notational shorthand, and usually makes it possible 
for B to generate better code. But the spaces around the opera- 
tor are critical! For instance, 

x = -10 
sets x to -10, while 

x =- 10 
subtracts 10 from x. When no space is present, 


x=-10 


also decreases x by 10. This is quite contrary to the experience 


of most programmers. 
Because assignment operators have lower binding strength than. any 
other operator, the order of evaluation should be watched 
carefully: 
x = xX<ylz 
means “shift x left y places, then OR in z, and store in x". But 
x =<<< ylz 
means "shift x left by y!z places", which is rather different. 


The character-packing example can now be written better as 


Cc = 0; 
i= 365 
while ((4 =- 9) >= 0) c =} getchar()<<i; 


20. Vectors 


A vector is a set of consecutive words, somewhat like a one- 
dimensional array in Fortran. The declaration 


auto v[10]; 


allocates 11 consecutive words, v[{0], v[1],..., v[10] for v. 
Reference to the i-th element is made by v[i]; the [] brackets 
indicate subscripting. (There is no Fortran-like ambiguity 
between subscripts and function call.) 


sum = 0; 

i= 0; 

while (i<=n) sum =+ v[itt]; 
This code sums the elements of the vector v. Notice the increment 
within the subscript - this is common usage in B. Similarly, to 
find the first zero element of v, 


S13 


while(v{++i] ) 


1. External Vectors 


Vectors can of course be external variables rather than auto. We 
declare v external within the function (don’t mention a size) by 


extrn v3 


and then outside all functions write 
v{10] ; 


External vectors may be initialized just as external simple 
variables: 


v({10] ‘hil’, 1, 2, 3, 0777; 


sets v[0O] to the character constant ‘hil’, and v[1] through v[4] 
to various numbers. v[5] through v[10] are not initialized. 


22. Addressing 


To understand all the ramifications of vectors, a digression into 
addressing is necessary. The actual implementation of a vector 
consists of a word v which contains in its right half the address 
of the O-th word of the vector; i.e., v is really a pointer to 
the actual vector. 


This implementation can lead to a fine trap for the unwary. If 
we say 


auto u[10], v[i0]; 
us V 


v[o] 
v[1] 


i il we 


oee 


the unsuspecting might believe that u[0] and u[1] contain the old 
values of v[0O] and v[1]; in fact, since u is just a pointer,to ,, 
v({O], u actually refers to the new content. The statement u=v 
does not cause any copy of information into the elements of u; 
they may indeed be lost because u no longer points to them. 


The fact that vector elements are referenced by a pointer to the 
zeroeth entry makes subscripting trivial - the address of vli] is 
simply vti. Furthermore, constructs like v[i][j] are quite 
legal: v[i] is simply a pointer to a vector, and v[i](j] is its 
j-th entry. You have to set up your own storage, though. 


To facilitate manipulation of addresses when it seems advisable, 
B provides two unary,address operators, ‘*’ and ‘’&’. ‘&" is the 
address operator so &x is the address of x, assuming it has 
one. ’*’ is the indirection operator; *x means use the con- 
tent of x as an address.” 


Subscripting operations can now be expressed differently: 


x(0] is equiyalent to *x 
x[1] Pe Oa 


&x [Oo] " x 
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23. Strings 


A string in B is in essence a vector of characters. It is written 
as 


“this is a string’ 


and, in internal representation, is a vector with characters 
packed four to a word, left justified, and terminated by a mys- 
terious character known as ‘*e’ (ascii EOT). Thus the string 


"hello" 


has six characters, the last of which is ‘’*e’. Characters are 
numbered from zero. Notice the difference between strings and 
character constants. Strings are double-quoted, and have zero or 
more characters; they are left justified and terminated by an 
explicit delimiter, “*e’. Character constants are single-quoted, 
have from one to four characters, and are right justified and 
zero-filled. 


Some characters can only be put into strings by “escape se- 
quences (-—>Appendix B). For instance, to get a double quote 
into a string, use ’* °, as in 


"He said, *"Let’s go.* 


External variables may be initialized to string values simply by 
following their names by strings, exactly as for other constants. 
These definitions initialize a, b, and three elements of v: 


a "hello"; b “world”; ,, ‘ 
v(2], now is the time , for all good men , 
to come to the aid of the party ; 


B provides several library functions for manipulating strings and 
accessing characters within them in a relatively machine- 
independent way. These are all Giscussed in more detail in. sec~ 
tion 8.2 of [1]. The two most commonly used are char and lchar. 
char(s,n) returns the n-th character of s, right justified and 
zero-filled. I1char(s,n,c) replaces the n-th character of s by c, 
and returns c. (The code for char is given in the next section. ) 


Because strings are vectors, writing "sit=s2. does not copy the 
characters in s2 into s1, but merely makes si point to s2. To 
copy s2 into si, we can use char and lchar in a fun¢tion like 
strcopy: 


strcopy(s1,s2){ 
auto i; 
i= 0; 
while (lchar(s1,i,char(s2,i)) I= °’*e’) i++; 


- 16 - 


24. Return statement 


How do we arrange that a function like char returns a value to 
its caller? Here is char: 


char(s,n){ 
auto y,sSh,cpos; 


y = s[n/4]; /* word containing n-th char */ 
cpos = n%4; /* position of char in word */ 

sh = 27-9*cpos; /* bit positions to shift */ 

y = (y>>sh)&0777; /* shift and mask all but 9 bits */ 
return(y); /* return character to caller */ 


A return in a function causes an immediate return to the calling 
program. If the return is followed by an expression in 
parentheses, this expression is evaluated, and the value returned 
to the caller. The expression can be arbitrarily complex, of 
course: char is actually defined as 


char(s,n) return((s[n/4]>>(27-9*(n%4) ) )&0777) ; 


Notice that char is written without {}, because it can be ex- 
pressed as a simple statement. 


25. Other String Functions 





concat(s,a1,a2,...,a10) concatenates the zero or more strings ai 
into one big string s, and returns s (which means a pointer to 
s(O0]). s must be declared big enough to hold the result (includ- 
ing the terminating ’*e’). 


getstr(s) fetches the entire next line from the input unit into 
Ss, replacing the terminating ‘*n’ by “’*e’, In B, it is 


getstr(s){ 
auto c,i; 
i-=/0% : 
while ((c=getchar()) != ‘*n’) lchar(s,it+tt,c); 
lchar(s,i,’*e’); 
return(s)3 


Similarly putstr(s) puts the string s onto the output unit, ex- 
cept for the final ’*e’. Thus 


auto s[20]; 
putstr(getstr(s)); putchar(’*n’); 


copies an input line to the output. 
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26. Switch statement; break 


loop: 
switch (c=getchar()){ 


pr: 
case ‘p’: /*' print */ 
case ‘PP’: 
print(); 
goto loop; 


case 
case 
subs ( 


‘ /* subs */ 
) 
goto p 


’ 
s 
’s 


Hy we cf co 


case ‘dad’: /* delete */ 
case ‘D’: 
delete(); 


goto loop; 


case ’*n’: /* quit on newline */ 


default; 
error(); /* error if fall out */ 
goto loop; 


This fragment of a text-editor program illustrates the switch 
statement, which is a multi-way branch. Here the branch is con- 
trolled by a Character read into c. If c is “p’ or ’P’, function 
print is invoked. If cis an ‘s’ or ‘S", subs and print are both 
invoked. Similar actions take place for other values. If c does 
not mention any of the values mentioned in the case statements, 
the default statement (in this case, an error function) is in- 
voked. 





The general form of a switch is 
switch (expression) statement 


The statement is always complex, and contains statements of the 
form 


case constant: 


The constants in our example are characters, but they could also 


be numbers. The expression is evaluated, and its value matched 
against the possible cases. If a match is found, execution con- 


tinues at that case. (Execution may of course fall through from 
one case to the next, and may transfer around within the switch, 
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or transfer outside it.) Notice we put multiple cases on several 
Statements. 


If no Match is found, execution continues at the statement la- 
beled “default: if it exists; if there is no match and no 
default, the entire statement part of the switch is skipped. 


The break statement forces an immediate transfer to the next 
Statement after the switch statement, i.e., the statement after 
the {} which enclose the statement part of the switch. break may 


also be used in while statements in an analogous manner. break 
is a useful way to avoid useless labels. 





27. Formatted IO 


In this section, we point to the function printf, which is used 
for producing formatted IO. This function is so useful that it 
is actually placed in Appendix C for ready reference. 


28. Lots More on IO 


File IO is described in much more detail in [1], section 8. This 
discussion does only the easy things. 


Basically, all Io in B is character-oriented, and based on 
getchar and putchar. By default, these functions refer to the 
terminal, and IO goes there unless explicit steps are taken to 
divert it. The diversion is easy and systematic, so IO unit 
switching is quite feasible. 


At any time there is one input unit and one output unit; getchar 
and putchar use these for their input and output by implication. 
The unit 1S a number between -1 and 10 (rather like Fortran file 
numbers). The current values are stored in the external vari- 
ables rd.unit and wr.eunit so they can be changed easily, although 
most programs need not use them directly. When execution of a B 
program begins, rd.unit is 0, which means read from terminal , 
and wreunit is 1, meaning write on terminal . 


To do input from a file, we set rd.unit to a value, and assign a 
file name to that unit. Then, until the assignment or the value 
is changed, getchar reads its characters from that file. Thus, 
to open for reading an ascii permanent file called "cat/file’, as 
input unit 5, 





openr(5,"cat/file”’) 


The first argument is the unit number, and the second is a 
cat/file description. (An empty string means the terminal.) 
rd.unit is set to 5 by openr so successive calls to getchar, 


=, oe 


etstr, etc., will fetch characters from this file. If the file 
is non-existent or if we read to end of file, all successive 
calls produce ‘*e’,. 
Writing its much the same; 


openw(u,s) 


opens ascii file s as unit u for writing. Successive calls to 
putchar, putstr, printf, etc., will place output on file s. 





If the IO unit mentioned in openr or openw is already open, it is 
closed before the next use. That is, any output for output files 
is flushed and end of file written; any input from input files is 
thrown away. Files are also closed this way upon normal termina- 
tion. 


Programs which have only one input and one output unit open 
simultaneously need not reference rd.unit and wr.unit explicitly, 
since the opening routines will do it automatically. Otherwise, 
don’t forget to declare them in an extrn statement. 


The function flush closes the current output unit without writing 
a terminating newline character. If this file is the terminal, 
this is the way to force a prompting question before reading a 
response on the same line: 





putstr("old or new?" ); 
flush(); 
getstr(s); 


prints 
old or new? 


and reads the response from the same line. 


File names may only be "cat/file’, "/file", or "file": no permis- 
sions, passwords, subcatalogs or alternate names are allowed. 
File names containing a ’/’ are assumed to be permanent files; if 
you try to open a permanent file which is already accessed, it is 
released and reaccessed. If it can’t be found, the access fails. 
A file name without ’/’ may or may not be a permanent file, so if 
a file of this name is already accessed, it is used without ques-— 
tion. If it is not already accessed, a search of the user’s 
catalog is made for it. If it can’t be found there either, and 
writing is desired, a temporary file is made. . 


The function reread is used to re-read a line, Its primary func— 
tion is that if it is called before the first call to getchar, it 
retrieves the command line with which the program was called. 


reread(); 
getstr(s); 
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fetches the command line, and puts it into string s. 


29. System Call from TSS 


— 





TSS users can easily execute other TSS subsystems from within a 
running B program, by using the library function system. For 
instance, 


system("./rj (w) ident;file’); 
acts exactly as if we had typed 
-/crj (w) ident;file 


in response to "SYSTEM?'. The argument of system is a string, 
which is executed as if typed at “SYSTEM?” level. (No ‘*n’ is 
needed at the end of the string.) Return is to the B program 
unless disaster strikes. This feature permits B programs to use 
TSS subsystems and other programs to do major functions without 
bookkeeping or core overhead. 


We can also use printf (->Appendix Cc) to write on the “system” Io 
unit, which is predefined as -1. All output written on unit —1 
acts as if typed at SYSTEM? level. Thus, 


extrn wr.unit; 
wreunit = —13 
opts = (w) 3) 
fname = file ; 

printf( ./rj %¥s ident;%s*n ,opts,fname); 


will set the current output unit to -1, the system, and then, 
print on it using printf to format the string. This example does 
the same thing as the last, but the RUNJOB options and the file 
name are variables, which can be changed as desired. Notice that 
this time a ’*n’ is necessary to terminate the line. 


Because of serious design failings in TSS, there is generally no 
decent way to send more than the command line as input to a pro- 
gram called this way, and no way to retrieve output from it. 


30. Recursion 


putnumb(n) { 
auto a3 
if(a=n/10) /* assignment, not equality test */ 
“putnumb(a); /* recursive */ 
putchar(n%10 + °0’); 


= 37 = 


This simplified version of the library function putnumb illus— 
trates the use of recursion. (It only works for n>0.) a is set 
to the quotient of n/10; if this is non-zero, a recursive call on 
putnumb is made to print the quotient. Eventually the high-order 
digit of n will be printed, and then as each invocation of put- 
numb is popped up, the next digit will be added on. 


31. Argument Passing for Functions 


There is a subtlety in function usage which can trap the un- 
suspecting Fortran programmer. Argument passing is call by 
value , which means that the called function is given the actual 
values of its arguments, and doesn’t know their addresses. This 
makes it very hard to change the value of one of the input 
arguments! 


For instance, consider a function flip(x,y) which is to inter- 
change its arguments. We might naively write 


flip(x,y){ 
auto t3 


tou 


rahe et 
rh x 


but this has no effect at all. (Non-believers: try it!) What we 
have to do is pass the arguments as explicit addresses, and use 
them as addresses. Thus we invoke flip as 

flip(&a,&b); 
and the definition of flip is 


flip(x,y){ 


auto t; 
t= *y;3 
*¥x = Fy; 
ty = t; 
i 


(And don’t leave out the spaces on the right of the equal signs.) 
This problem doesn’t arise if the arguments are vectors, since a 
vector is represented by its address. 


32. Calling Fortran and GMAP Subroutines from B Programs 


As we mentioned, B doesn’t have floating point, multi-dimensional 
arrays, and similar things that Fortran does better. A B pro- 
gram, callf, allows B programs to call Fortran and GMAP 


Poe Le Ree 


subroutines, like this: 
callf(name,ai,a2,...,a10) 


Here name is the Fortran function or subroutine to be used, and 
the a’s are the zero or more arguments of the subroutine. 


There are some restrictions on callf, related to the "call by 
value problem mentioned above, First, variables that are not 
vectors or strings must be referenced by address: use "8x" in- 
stead of x as an argument. Second, no constants are allowed 
among the arguments, because the construction &constant is 
illegal in B. We can’t say 


tod = callf(itimez,&0) 
to get the time of day; instead we have to use 


t= 0; 
tod = callf(itimez,&t); 


Third, it is not possible to return floating-point function 
values from Fortran programs, since the results are in the wrong 
register. They must be explcitly passed back as arguments. Thus 
we can’t say 


s = callf(sin,&x) 
but rather 
callf(sin2,&x,&s) 
where sin2 is defined as 
subroutine sin2(x,y) 
y = sin(x) 
return 
end 
Fourth, calif isn’t blindingly efficient, since there’s a lot of 
overhead converting calling sequences. (Very short GMAP routines 


to interface with B programs can often be coded quite efficiently 
- see [1], section 11.) 
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Appendix A: B Compiler Diagnostics 


Diagnostics consist of two letters, an optional name, anda 


source line number. 


Because of the free form input, the source 


line number may be too high. 


name 


name 
keyword 
name 


meaning 


{} imbalance 

() imbalance 

/**/ imbalance 

[] imbalance 

case table overflow (fatal) 

expression stack overflow (fatal) 

label table overflow (fatal) 

symbol table overflow (fatal) 

expression syntax error 

address expected 

name redeclaration —- multiple definition 
statement syntax error 

undefined name —- line number is last line 
of function where definition is missing 
syntax error in external definition 
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Appendix B: Escape Sequences and Typography 


Escape sequences are used in character constants and strings to 
obtain characters which for one reason or another are hard to 
represent directly. They consist of “*-’, where ‘-’ is as below: 


#0 null (ascii NUL = 000) 

*e end of string (ascii EOT = 004) 
* ( 

*) } 

et tab 

4 # 

HH! ¢ 

e" w 

¥n newline 


Users of non-ascii devices like model 33/35 Teletypes or IBM 
2741°s should use a reasonably clever text-editor like QED to 
create special characters for B programs. 


es 


Appendix C: printf for Formatted Io 


printf(fmt,a1,a2,a3,...,a10) 


This function writes the arguments ai (from 0 to 10 of which may 
be present) onto the current output stream under control of the 
format string fmt. The format conversion is controlled by two- 
letter sequences of the form ‘°%x’ inside the string fmt, where x 
Stands for one of the following: 


-- character data (ascii) 
decimal number 

-- octal number 

~- character string 


noma 
\ 
| 


The characters in fmt which do not appear in one of these two 
character sequences are copied without change to the output. 


Thus the call 

printft("¥d + %o is %s or %¥c%#n",1,-1, zero ,’0O’); 
leads to the output line 

1 + 777777777777 is zero or O 

As another example, a permanent file whose name is in the string 
s can be created with size n blocks and general read permission 
by using printf together with the system output unit (->29) and 
the filsys subsystem: 


wreunit= —13 “ 
printf( filsys cf %s,b/%d/,r*n ,s,n)3 
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Appendix D: Binding Strength of Operators 


Operators are listed from highest to lowest binding strength; 
there is no order within groups. Operators of equal strength 
bind left to right or right to left as indicated. 


—— * & = 1! ™~ (unary) [RL] 
/ % (binary) [LR] 
- [LR] 


<< [LR] 
- <= >= 


+ 


MN VY + e+ 
Vv 


[LR] 


v- 
em 

a 

na 
a) 


=+ =- etc, (all assignment operators) [RL] 


Name 





addressing 

address operator (&) 

arguments 

arithmetic 

arithmetic operators 
binding strengths 


assignment expressions 


assignments, multiple 
assignment operators 
auto declaration 


binding strengths 
bit operators 
./BJ (compiling B) 


Appendix E: Index 


Section 


22 
22 
2,31 
4,5 
5 

D 
14 
14 
19 
4 


18 
3 


callf (calling Fortran) 32 


characters (’...’) 
close 

comments (/*...*/) 
compiling B programs 
compound stat ({...}) 
concat 
conditionals 
conditional expr (?:) 


declarations 
default 
diagnostics (errors) 


else 

escape sequence (*x) 
exit 

external vectors 
extrn declaration 


file IO 

£lush 

formatted IO (printf) 
Fortran 
functions 
function values 


getchar 
getstr 
GMAP 


goto 


26 
23 
6 

28 
11 
3 

12 
25 


11,14,16,26 


16 


4 
26 
3,A 


16 
B 
15 
21 
7 


28 
28 
Cc 

32 


3,8,13,24,30 


24 


9 

25 
32 
11 


Name 





if 

index 

indirection op (*) 
Io (file) 

IO (formatted) 

IO (character) 

IO (string) 


labels 
layout of B programs 
ichar 


main 


newline (*n) 
null statement 


octal numbers 


‘openr, openw 


printf 


-putchar 


putnumb 
putstr 


rd.unit (file I0) 
recursion 


relational ops (==,...) 


reread 
return 
running B programs 


statement format 
strings ( eee 
subscripts 
switch 


TSS system call 
types (typeless) 


unary operators (++,--) 


variable names (syntax) 


vectors 


while 
wreynit (file Io) 


Section 


11 
E 

22 
28 


Cc 
6,9 
25 


20,22 


17 


4 
20,21,22 


14,15 
28 


